You are encouraged to discuss problem sets with your fellow students (and with the course instructor), but you must write your own final answers, in your own words. Solutions prepared "in committee" or by copying someone else's paper are not acceptable. Doing so violates the Brown standards on plagiarism, and you will not have the benefit of having thought about and worked the problem when you take the examinations.
All answers must be in complete sentences and all graphs must be properly labeled.
For the PDF Version of this assignment: PDF
For the R Markdown Version of this assignment: RMarkdown
This homework will use the following data:
[...] Y, jointly even though they do not necessarily predict Y very well individually. There are two predictor variables in the data set, X1 and X2.
[...] Y. Comment on whether the predictors seem to relate to Y. What percent of the variability in Y does each predictor explain by itself?
Use lm() to build a multiple regression model using both predictor variables X1 and X2. Comment on the fit and the statistical significance of each predictor variable. What percent of the variability in Y is explained by the model now that both predictors are included? Give an explanation for what you think is happening with both predictors in the model.
Why do X1 and X2 predict Y so well together when they do not alone?
library(plotly)
hw1a <- read.csv("https://raw.githubusercontent.com/php-1511-2511/php-1511-2511.github.io/master/Data/hw1a.csv")
p <- plot_ly(hw1a, x = ~x1, y = ~x2, z = ~y) %>%
  add_markers() %>%
  layout(scene = list(xaxis = list(title = 'x1'),
                      yaxis = list(title = 'x2'),
                      zaxis = list(title = 'y')))
p
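To make the idea of predictors that work jointly but not individually concrete, here is a minimal simulated sketch. It does not use hw1a, and the mechanism it shows (two highly correlated predictors whose small difference drives the outcome) is only one possible explanation, assumed here for illustration. All variable names and parameter values are hypothetical.

```r
# Simulated illustration only; these are NOT the hw1a data.
set.seed(1511)
n  <- 200
x1 <- rnorm(n)
# x2 is almost a copy of x1, so the two predictors are highly correlated
x2 <- x1 + rnorm(n, sd = 0.1)
# y depends on the small difference between the two predictors
y  <- 10 * (x1 - x2) + rnorm(n, sd = 0.1)

# R-squared from each single-predictor model and from the joint model
r2_x1   <- summary(lm(y ~ x1))$r.squared
r2_x2   <- summary(lm(y ~ x2))$r.squared
r2_both <- summary(lm(y ~ x1 + x2))$r.squared

round(c(x1_only = r2_x1, x2_only = r2_x2, both = r2_both), 3)
```

In this setup each predictor alone explains very little of the variability in y, while the two together explain nearly all of it, because only the joint model can recover the difference x1 - x2.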
Data set hw1b contains air pollution data from 41 U.S. cities. Our goal is to try to build a multiple regression model to predict SO2 concentration using the other variables.
| Variable Name | Description |
|---|---|
| so2 | SO2 air concentration in micrograms per cubic meter. |
| temp | Average annual temperature in degrees F. |
| empl20 | The number of manufacturing companies with 20 or more workers. |
| pop | The population in thousands. |
| wind | The average annual wind speed in miles per hour. |
| precipin | The average annual precipitation in inches. |
| precipdays | The average number of days with precipitation per year. |
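As a rough sketch of the mechanics of regressing so2 on all of the other variables, the block below fits such a model to placeholder data. The data frame `fake_hw1b` is hypothetical: it reuses the column names from the table above, but its values are random noise, so the estimates it produces mean nothing. With the real hw1b loaded, the same `lm()` call would apply to that data frame instead.

```r
# Placeholder data: same column names as the table, values are random noise.
# This is NOT the real hw1b data set.
vars <- c("so2", "temp", "empl20", "pop", "wind", "precipin", "precipdays")
set.seed(41)
fake_hw1b <- as.data.frame(matrix(rnorm(41 * length(vars)), nrow = 41))
names(fake_hw1b) <- vars

# "so2 ~ ." regresses so2 on every other column in the data frame
fit_all <- lm(so2 ~ ., data = fake_hw1b)

# Coefficient table (estimate, std. error, t value, p-value) and R-squared
coef_table <- summary(fit_all)$coefficients
r_squared  <- summary(fit_all)$r.squared
coef_table
r_squared
```

Writing the predictors out explicitly (`so2 ~ temp + empl20 + ...`) is equivalent and makes it easier to drop or add one variable at a time when comparing models.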
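library(broom)  # provides tidy()
# model1 and model2 must already be fitted model objects (for example, from lm())
tidy1 <- tidy(model1)
tidy2 <- tidy(model2)
rbind(tidy1, tidy2)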
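One thing to watch for with `rbind(tidy1, tidy2)` is that the stacked rows no longer say which model they came from. The sketch below shows one way to keep track, using only base R and the built-in `mtcars` data as a stand-in. The helper `coef_df()` and the two example models are hypothetical and are not the `model1` and `model2` of this assignment.

```r
# Hypothetical helper: coefficient table of a fitted model as a data frame,
# with an extra column naming the model each row belongs to.
coef_df <- function(model, label) {
  est <- summary(model)$coefficients
  data.frame(model = label, term = rownames(est), est,
             row.names = NULL, check.names = FALSE)
}

# Stand-in models fitted to the built-in mtcars data
m_small <- lm(mpg ~ wt, data = mtcars)
m_big   <- lm(mpg ~ wt + hp, data = mtcars)

# Stack the two tables; the "model" column keeps the rows distinguishable
stacked <- rbind(coef_df(m_small, "wt only"), coef_df(m_big, "wt + hp"))
stacked
```

Reading the rows for the same term side by side makes it easy to see how an estimate and its standard error change when another predictor enters the model.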